Comparing Rule-based and Statistical MT Output

Author

  • Gregor Thurmair
Abstract

This paper describes a comparison between a statistical and a rule-based MT system. The first section describes the setup and the evaluation results; the second analyses the strengths and weaknesses of the respective approaches; and the third attempts to define an architecture for a hybrid system, based on a rule-based backbone and enhanced by statistical intelligence. This contribution originated in a project called “Translation Quality for Professionals” (TQPro), which aimed at developing translation tools for professional translators. One of the interests in this project was to find a baseline for machine translation quality, and to extend MT quality beyond it. The baseline should compare state-of-the-art techniques for both statistical packages and rule-based systems, and draw conclusions from the comparison. This paper presents some insights into the results of this work.

Similar resources

Can Statistical Post-Editing with a Small Parallel Corpus Save a Weak MT Engine?

Statistical post-editing has been shown in several studies to increase BLEU score for rule-based MT systems. However, previous studies have relied solely on BLEU and have not conducted further study to determine whether those gains indicated an increase in quality or in score alone. In this work we conduct a human evaluation of statistical post-edited output from a weak rule-based MT system, co...

Statistical Phrase-Based Post-Editing

We propose to use a statistical phrase-based machine translation system in a post-editing task: the system takes as input raw machine translation output (from a commercial rule-based MT system), and produces post-edited target-language text. We report on experiments that were performed on data collected in precisely such a setting: pairs of raw MT output and their manually post-edited versions. ...

An Evaluation of Statistical Post-Editing Systems Applied to RBMT and SMT Systems

Statistical post-editing (SPE) of the output produced by rule-based MT (RBMT) systems has been reported to produce extraordinary BLEU (and other automatic evaluation) score improvements. SPE has also been applied to the output of statistical MT (SMT) systems, albeit with more mixed results. We present a statistical post-editing pipeline and evaluate the outputs using automatic and human evaluat...

Machine Translation System for Patent Documents Combining Rule-based Translation and Statistical Postediting Applied to the NTCIR-10 PatentMT Task

In this article, we describe the system architecture, the preparation of training data, and the experimental results of the EIWA group in the NTCIR-10 Patent Translation Task. Our system combines rule-based machine translation and statistical post-editing. What is new in our system compared with the NTCIR-9 PatentMT task is the implementation of an automatic method for selecting from multiple translations...

Rule-Based Translation with Statistical Phrase-Based Post-Editing

This article describes a machine translation system based on an automatic post-editing strategy: initially translate the input text into the target-language using a rule-based MT system, then automatically post-edit the output using a statistical phrase-based system. An implementation of this approach based on the SYSTRAN and PORTAGE MT systems was used in the shared task of the Second Workshop...

Publication date: 2004